Reinforcement Learning with Preferences

Authors

  • Johannes Feldmaier
  • Hao Shen
Abstract

In this work, we propose a framework for learning with preferences that combines neurophysiological findings, prospect theory, and the classic reinforcement learning mechanism. Specifically, we extend the state representation of reinforcement learning with a multi-dimensional preference model controlled by an external state. This external state is designed to be independent of the reinforcement learning process, so that it can be controlled by an external process simulating the knowledge and experience of an agent while preserving all major properties of reinforcement learning. Finally, numerical experiments show that the proposed method is capable of learning different preferences in a manner sensitive to the agent’s level of experience.
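
The idea of extending the state with an externally controlled component can be illustrated with a small sketch. This is not the authors' model, which the abstract does not specify. It assumes a tabular value table keyed by (external state, state, action), two hypothetical external states ("novice" and "expert"), and a hypothetical rule in which the novice perceives rewards through a prospect-theory value function with the commonly cited Tversky-Kahneman parameters, while the expert sees the raw reward. The toy task, its payoffs, and all function names are assumptions made for illustration only.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theory value function: concave for gains, steeper for losses."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** beta)


def shaped_reward(raw, ext):
    # Hypothetical preference rule: only the "novice" external state
    # filters rewards through the prospect-theory value function.
    return prospect_value(raw) if ext == "novice" else raw


def train(q, ext, steps=1000, lr=0.1):
    """One-step tabular updates on the augmented key (ext, state, action).

    Toy task with a single state 0: action 0 is safe (+1), action 1 is
    risky and alternates deterministically between +4 and -1.
    The external state `ext` is passed in from outside and is never
    modified by the learning update itself.
    """
    risky_outcomes = [4.0, -1.0]
    for t in range(steps):
        action = t % 2  # alternate actions so both are sampled equally
        raw = 1.0 if action == 0 else risky_outcomes[(t // 2) % 2]
        key = (ext, 0, action)
        old = q.get(key, 0.0)
        # Episodes end after one step, so there is no bootstrapped term.
        q[key] = old + lr * (shaped_reward(raw, ext) - old)
    return q


def greedy_action(q, ext, state=0, n_actions=2):
    """Pick the action with the highest learned value for this external state."""
    return max(range(n_actions), key=lambda a: q.get((ext, state, a), 0.0))


if __name__ == "__main__":
    q = {}
    for ext in ("novice", "expert"):
        train(q, ext)
        print(ext, "prefers action", greedy_action(q, ext))
```

Because the external state is only part of the lookup key, the update rule itself is unchanged from ordinary tabular learning. In this toy setting the loss-averse "novice" ends up preferring the safe action and the "expert" the risky one, which mirrors in miniature the abstract's claim that learned preferences depend on the agent's level of experience.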

Similar resources

Preference elicitation and inverse reinforcement learning

We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous work on Bayesian inverse reinforcement learning and allows us to obtain a posterior distribution on the agent’s preferences, policy and optionally, the obtained reward sequence, from observations. We examine the relati...

Learning User Preferences in Ubiquitous Systems: A User Study and a Reinforcement Learning Approach

Our study concerns a virtual assistant, proposing services to the user based on its current perceived activity and situation (ambient intelligence). Instead of asking the user to define his preferences, we acquire them automatically using a reinforcement learning approach. Experiments showed that our system succeeded in learning user preferences. In order to validate the relevance and usabi...

Exploitation of User’s Preferences in Reinforcement Learning Decision Support Systems

A system called COLMAS (COordination Learning in Multi-Agent System) has been developed to investigate how the integration of realistic geosimulation and reinforcement learning might support a decision-maker in the context of cooperative patrolling. COLMAS is a model-driven automated decision support system combining geosimulation and reinforcement learning to compute near optimal solutions. Bu...

Learning from Trajectory-Based Action Preferences

Conventional reinforcement learning algorithms depend on the availability of a numerical feedback signal. In many domains, this is not readily available and, in fact, constitutes an additional parameter of the problem setting. As a consequence, a fair amount of engineering is required in order to find a reasonable configuration of reward signal and algorithm parameters that facilitates an effic...

Publication date: 2015